Computing inter-rater reliability and its variance in the presence of high agreement.
Abstract
The pi (π) and kappa (κ) statistics are widely used in psychiatry and psychological testing to measure the extent of agreement between raters on nominally scaled data. These coefficients occasionally yield unexpected results in situations known as the paradoxes of kappa. This paper explores the origin of these limitations and introduces an alternative, more stable agreement coefficient referred to as the AC1 coefficient. Also proposed are new variance estimators for the multiple-rater generalized π and AC1 statistics, whose validity does not depend on the hypothesis of independence between raters. This is an improvement over existing alternative variances, which depend on the independence assumption. A Monte-Carlo simulation study demonstrates the validity of these variance estimators for confidence interval construction and confirms the value of AC1 as an improved alternative to existing inter-rater reliability statistics.
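The kappa paradox mentioned in the abstract can be illustrated with a small sketch. The formulas used are the two-rater definitions of Cohen's kappa, Scott's pi, and the AC1 coefficient as they are commonly stated in the literature; they are not reproduced from this page, and the multiple-rater generalizations and variance estimators proposed in the paper are not covered. The function name `agreement_coefficients` and the 2x2 counts are hypothetical, chosen only to show high raw agreement combined with a very skewed category prevalence.

```python
def agreement_coefficients(table):
    """Two-rater agreement coefficients from a square contingency table.

    table[i][j] = number of subjects that rater A placed in category i
    and rater B placed in category j (hypothetical input layout).
    """
    q = len(table)                       # number of categories
    n = sum(sum(row) for row in table)   # number of subjects

    # Observed proportion of agreement (diagonal of the table).
    p_a = sum(table[k][k] for k in range(q)) / n

    # Marginal classification proportions for each rater, and their average.
    row = [sum(table[k]) / n for k in range(q)]
    col = [sum(table[i][k] for i in range(q)) / n for k in range(q)]
    avg = [(row[k] + col[k]) / 2 for k in range(q)]

    # Chance-agreement terms: this is the only place the coefficients differ.
    pe_kappa = sum(row[k] * col[k] for k in range(q))
    pe_pi = sum(p * p for p in avg)
    pe_ac1 = sum(p * (1 - p) for p in avg) / (q - 1)

    def chance_corrected(p_e):
        return (p_a - p_e) / (1 - p_e)

    return {
        "p_a": p_a,
        "kappa": chance_corrected(pe_kappa),
        "pi": chance_corrected(pe_pi),
        "ac1": chance_corrected(pe_ac1),
    }


# Hypothetical data: 100 subjects, the raters agree on 90 of them,
# and almost every subject falls in the first category.
result = agreement_coefficients([[90, 5], [5, 0]])
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts the raters agree on 90% of subjects, yet kappa and pi both come out slightly negative (about -0.05), while AC1 is about 0.89. The gap comes entirely from the chance-agreement term, which kappa and pi push close to 1 when one category dominates.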
Related resources
Test-Retest and Inter-Rater Reliability Study of the Schedule for Oral-Motor Assessment in Persian Children
Objectives: Reliable and valid clinical tools to screen, diagnose, and describe eating functions and dysphagia in children are highly warranted. Today most specialists are aware of the role of assessment scales in the treatment of affected individuals. However, the problem is that the clinical tools used might be nonstandard, and worldwide, there is no integrated assessment performed to assess ...
Functional Movement Screen in Elite Boy Basketball Players: A Reliability Study
Purpose: To investigate the reliability of Functional Movement Screen (FMS) in basketball players. A few studies have compared the reliability of FMS between raters with different experience in athletes. The purpose of this study was to compare the FMS scoring between the beginners and expert raters using video records. Methods: This is a cross-sectional study. The study subjects compris...
Comparison between inter-rater reliability and inter-rater agreement in performance assessment.
INTRODUCTION Over the years, performance assessment (PA) has been widely employed in medical education, Objective Structured Clinical Examination (OSCE) being an excellent example. Typically, performance assessment involves multiple raters, and therefore, consistency among the scores provided by the auditors is a precondition to ensure the accuracy of the assessment. Inter-rater agreement and i...
Assessing Environmental Hazards Using the Reliability of the Persian Version of the Home Falls and Accidents Screening Tool in Iranian Older Adults
Introduction: Falling is one of the common problems among older people. Falls at home and in the street account for a large share of incidents among Iranian elderly people, so efforts to identify environmental hazards at home, together with home modification, can reduce falls and injuries in the elderly. The aim of this study was to identify elderly people at risk of falling using a screening tool (HOME FAST) and deter...
Nurse-Physician Agreement on Triage Category: A Reliability Analysis of Emergency Severity Index
Background and Objectives: The Emergency Severity Index (ESI) triage is commonly used in clinical settings to determine the patients’ emergency severity. However, the reliability of this index is not sufficiently explored. The present study examines the inter-rater reliability of ESI by comparing triage ratings as performed by nurses and physicians. Methods: This prospective cross-sectional st...
Journal:
- The British journal of mathematical and statistical psychology
Volume 61, Pt 1
Publication date: 2008